[Draft] | GPUAI-3720 - Integrate Universal GEMM into Grouped GEMM - Pt 1 #1800

Draft
wants to merge 9 commits into
base: develop
from

Conversation

rtmadduri
Contributor

Proposed changes

This PR integrates Universal GEMM into Device Grouped GEMM.
Specifically, in device_grouped_gemm_xdl_splitk_cshuffle.hpp we replace GridwiseGemm_bk0mk1_bk0nk1_mn_xdlops_v2r4r2 with GridwiseGemm_xdl_cshuffle_v3.

We make the corresponding changes to struct Argument and struct Invoke.

Checklist

Please put an x into the boxes that apply. You can also fill these out after creating the PR. If you're not sure, please don't hesitate to ask.

  • I have added tests relevant to the introduced functionality, and the unit tests are passing locally
  • I have added inline documentation that helps the maintainers understand the motivation
  • I have removed the stale documentation which is no longer relevant after this pull request
  • (If this change is user-facing) I have added release notes which provide the end users with a brief summary of the improvement from this pull request
  • I have run clang-format on all changed files
  • Any dependent changes have been merged

Discussion

If this is a relatively large or complex change, feel free to start a discussion by explaining why you chose the solution you did and what alternatives you considered

@rtmadduri rtmadduri force-pushed the rimadduri/universal_gemm_into_grouped_gemm branch from 469a083 to 15a21fc Compare January 7, 2025 19:25
Collaborator

@aosewski aosewski left a comment

Good work, but there are still a number of things to do. Keep in mind that CDataType is equivalent to EDataType, so there's no need to duplicate it; just pick one, preferably EDataType.
Additionally, don't forget to check all necessary conditions for every GEMM and infer from them values that hold for the whole group, such as all_have_kbatch_gt_one or all_have_main_kblock_loop; these must be checked for all GEMMs.
You'd also have to calculate tail_number and verify that it's the same for all GEMMs.
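
A minimal, standalone sketch of the per-group checks described above might look like the following. It does not use any Composable Kernel types: `GemmProblem`, `KPerBlock`, `NumKLoops`, `HasMainKBlockLoop`, `GetTailNumber`, `GroupProperties` and `InferGroupProperties` are hypothetical names, and the loop-count and odd/even tail rules are assumptions standing in for whatever the gridwise kernel actually reports. Only the flag names all_have_kbatch_gt_one and all_have_main_kblock_loop come from the comment.

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical per-GEMM problem description; not a Composable Kernel type.
struct GemmProblem
{
    std::int64_t M, N, K;
    std::int64_t k_batch; // split-K factor
};

// Assumed two-valued tail classification, for illustration only.
enum class TailNumber { Odd, Even };

constexpr std::int64_t KPerBlock = 32; // assumed tile size along K

// Assumed rule: number of K-block iterations per split-K slice.
inline std::int64_t NumKLoops(const GemmProblem& p)
{
    const std::int64_t k_per_split = (p.K + p.k_batch - 1) / p.k_batch;
    return (k_per_split + KPerBlock - 1) / KPerBlock;
}

// Stand-ins for the queries the real gridwise kernel would answer.
inline bool HasMainKBlockLoop(const GemmProblem& p) { return NumKLoops(p) > 1; }

inline TailNumber GetTailNumber(const GemmProblem& p)
{
    return NumKLoops(p) % 2 == 0 ? TailNumber::Even : TailNumber::Odd;
}

struct GroupProperties
{
    bool all_have_kbatch_gt_one;
    bool all_have_main_kblock_loop;
    std::optional<TailNumber> common_tail_number; // empty if the GEMMs disagree
};

// Walk every GEMM in the group and keep only properties that hold for all.
inline GroupProperties InferGroupProperties(const std::vector<GemmProblem>& gemms)
{
    GroupProperties r{};
    if(gemms.empty())
        return r; // nothing can be claimed about an empty group

    r.all_have_kbatch_gt_one =
        std::all_of(gemms.begin(), gemms.end(),
                    [](const GemmProblem& p) { return p.k_batch > 1; });

    r.all_have_main_kblock_loop =
        std::all_of(gemms.begin(), gemms.end(),
                    [](const GemmProblem& p) { return HasMainKBlockLoop(p); });

    const TailNumber first = GetTailNumber(gemms.front());
    const bool same_tail =
        std::all_of(gemms.begin(), gemms.end(),
                    [first](const GemmProblem& p) { return GetTailNumber(p) == first; });
    if(same_tail)
        r.common_tail_number = first;

    return r;
}
```

With a result like this, the host side could reject the argument (or fall back to another path) when `common_tail_number` is empty, rather than launching one kernel configuration that is only valid for some of the GEMMs.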

@rtmadduri rtmadduri force-pushed the rimadduri/universal_gemm_into_grouped_gemm branch from 4eea3bd to a6f99d8 Compare January 16, 2025 19:06
@rtmadduri rtmadduri requested a review from aosewski January 17, 2025 16:09

2 participants